Introduction

The role of genetics in cancer is now well established with the identification of several
genes in which mutations have been associated with cancer formation (1). It has been
shown that mutations of tumor suppressor genes, which act as negative regulators of cell division,
contribute to oncogenesis by interfering with the mechanisms that restrain cell multiplication. Thus
genes encoding proteins involved in cellular functions including signal transduction, transcription,
and phosphorylation/dephosphorylation cell cycle pathways are prime candidates for tumor
suppressor genes. In familial forms of cancer, a combination of germ-line and somatic mutations
on each allele results in chromosome loss or deletion, mitotic recombination, or gene conversion.
Similar events uncovering recessive somatic mutations also occur in sporadic forms of cancer. The
loss of genetic material inherited from one parent can be detected by loss of heterozygosity (LOH)
analysis using genetic markers. LOH studies on tumors and linkage analysis in inherited forms of
cancer have resulted in the identification of several tumor suppressor genes (2). The gene for
hereditary breast cancer, BRCA1, is also associated with hereditary cancer of the ovary
(2,3). BRCA1 was mapped to chromosome 17q21, a region that is also associated
with allele loss (LOH) in sporadic breast and ovarian cancer
(4,5,6,7,8,9). After intensive investigation by many research groups, BRCA1 was
identified by positional cloning (10). Many mutations in the BRCA1 gene were found in
patients with hereditary breast and ovarian cancer (11,12). Surprisingly, these studies also show
that mutations in BRCA1 are rare in sporadic breast and ovarian cancers that are thought to be due
to susceptibility to the disease at this locus (13,14,15). Together, the LOH studies and the lack of
mutations in BRCA1 have led to the proposal that there is another gene within the region 17q12-q22
that is associated with sporadic breast and ovarian cancer in women (16). Many studies have
demonstrated LOH in other regions of human chromosome 17 associated with breast cancer
(17,18). These studies indicate that a region telomeric to the P53 gene at 17p13.3 (about 3 cM)
may harbor a separate tumor suppressor gene associated with breast cancer. Other
work also shows that the region 17q24-25 is associated with another tumor
suppressor gene (8,16). Identifying these potential sporadic breast cancer genes on
chromosome 17 is expected to be a considerable challenge. The strategies for cloning a disease-related gene
include functional and positional cloning approaches (19). With the effort of the Human
Genome Initiative in cDNA and expressed sequence tag (EST) mapping, a candidate gene approach
to finding human disease genes has been predicted to be the future trend. Our research interest is
focused on the development of new strategies to identify genes from chromosome 17. The
isolation of genes transcribed from chromosome 17 will provide candidates for the proposed
sporadic breast and ovarian cancer genes. The next phase of this proposal is to find candidate
tumor suppressor genes associated with sporadic breast cancer in the regions 17p13, 17q12-22,
and 17q24-25 by combining candidate, functional, and positional cloning strategies.


Results 


We  have  reported  a  method  for  the  isolation  of  chromosome  specific  cDNAs  using  high 
density  arrayed  cDNA  and  chromosome  specific  cosmid  libraries  (20).  The  ability  to  isolate  genes 
in  a  chromosome  specific  manner  provides  simultaneous  identification  of  the  expressed  sequence 
and  a  chromosomal  location.  This  new  technology  identifies  expressed  sequences  by  reciprocal 
probing  of  arrayed  cDNA  libraries  and  a  chromosome  specific  cosmid  library. 

The isolated chromosome-specific cDNA clones were sequenced by single-pass
sequencing from the 5' and 3' ends. The corresponding cosmids were used for fluorescent in situ
hybridization (FISH) mapping to localize their chromosomal position. The sequence information
was used to generate sequence-tagged site (STS) primers for polymerase chain reaction (PCR)
mapping on chromosome 17 somatic hybrid cell lines to further confirm the map position of each cDNA
and its corresponding cosmid.

In our last report we described our work on a human placental cDNA library, from which
we isolated and characterized 42 cDNAs mapping to chromosome 17. In the past year we have
completed our goal of arraying 40,000 clones from a placental and an ovarian cDNA library. The
generation of PCR products for high-density filters from these libraries has also been completed.
This phase of the project is on target with our statement of work. Our preliminary analysis of the
human ovarian library indicates that the level of gene diversity is not high. This is not surprising
since the ovary is a highly specialized organ. To reduce the redundancy of clones and to increase the
level of gene diversity we elected to work with a human heart tissue library. In line with this we
arrayed a 20,000-clone human heart cDNA library. Initial characterization of this
library suggested a relatively high level of gene diversity, and we elected to proceed with this library
for our next level of study. Probes from the 20,000 heart cDNAs were used to screen a human
chromosome 17 cosmid library. A total of 732 clones from the chromosome 17 cosmid library
were identified as containing expressed sequences. Keeping in mind that the cosmid libraries were
generated with 5-10X coverage of the chromosome, this number of cosmids could be associated
with about 60-80 potential genes. Indeed, our analysis of these cosmids gave 63 unique cDNAs in
addition to those found in the placenta studies. As can be seen from our progress, we have
exceeded our goals for the first 24 months. We are in the process of identifying cosmids associated
with expressed sequences using probes from the ovarian library.

So far, using our approach, 105 unique cDNAs from chromosome 17 have been identified.
Each of the cDNAs identified in this study was sequenced from the 5' and 3' ends of the clone.
Sequences were searched against the Genome database to determine homologies to existing genes
of known function. From these searches, we were able to identify human genes that have not been
described. These include genes that are novel, i.e. with no homology in the database, and sequences
with homology to expressed sequence tags (ESTs). cDNAs with homology to non-human DNA and protein
sequence motifs were also included in the group of genes to be studied. cDNAs with sequence
homology to characterized human genes were excluded from further studies. Of the 105 cDNAs
we identified, 43 are either novel or have homology to ESTs in the database; 16 have
previously been described and mapped to chromosome 17; the remaining 46 are
homologous to previously described but unmapped genes (see Table 1). These cDNAs can be
further grouped by their mapped position on human chromosome 17. We mapped 8
cDNAs to the region 17p13, 44 cDNAs to the region 17q12-22, and 20 cDNAs to the region
17q24-25. The distribution of the 105 characterized cDNAs is shown in Figure 1. Together,
72 cDNAs are localized in 17p13, 17q12-22, and 17q24-25, the three regions targeted
in this study.


Table 1. Identified cDNAs from chromosome 17

Category of cDNA                                Number of cDNAs
cDNAs previously mapped to chromosome 17        16
cDNAs with known function but not mapped        46
cDNAs with unknown function                     43

Given that 72 cDNAs map to the targeted regions, we have to devise selection
criteria for taking these cDNAs forward for detailed characterization. This is essentially our plan of research
for months 18-48. We propose to carry out the following strategies for analysis of these clones as
candidates for tumor suppressor genes.

1). Isolate full-length cDNAs of the genes localized in the LOH regions by screening
specifically primed cDNA libraries, and obtain the full-length DNA sequence by automated
fluorescence sequencing. Based on a cDNA's sequence information we can infer its
potential biological function from its homology to existing protein motifs. The encoded amino acid
sequences will be searched against protein databases. The rapid development of computational
methods and search tools has greatly helped biologists to compare novel genes with known ones and to infer
their function. Recently our department has developed a new search tool,
BEAUTY, to compare a novel gene with genes of known function in databases (21). Unlike
conventional search methods, this tool focuses on the comparison of functional domain similarity,
which gives us valuable information about the possible functions of novel genes.
Genes with protein motifs that fall under cellular functions including signal
transduction, transcription, and phosphorylation/dephosphorylation cell cycle pathways will be
chosen for further study. Once such genes are identified, their gene structure and exon-intron
boundaries will be determined by primer-directed sequencing of the cosmids associated with the
corresponding cDNAs. Although we have mapped 72 cDNAs to the regions 17p13, 17q12-22, and
17q24-25 by FISH, we still do not know whether these cDNAs lie within the narrower regions
in which sporadic breast cancer patients show LOH. To answer this question, we need to
find polymorphic markers linked to these cDNAs. The cosmids associated with the cDNAs will
be digested with a combination of restriction enzymes into fragments of about 500 bp. The digested
fragments will be subcloned into an M13 vector and screened with a (CA)13 repeat oligonucleotide
probe. The positive clones will be sequenced. The level of heterozygosity for each microsatellite
repeat will be determined by PCR-based typing of genomic DNA from twenty individuals, with primers
derived from the sequences flanking the (CA)n repeat. A repeat with heterozygosity greater than 0.6 will be
considered a polymorphic marker. Along similar lines, we will test whether the existing LOH microsatellite
markers are located in the cosmids associated with our genes. DNA from
breast tumor cell lines will be typed with the microsatellite markers obtained from our genes. This
will determine whether any of these genes are located in the LOH regions. Mutation detection will
be based on PCR-based single-strand conformation polymorphism (SSCP), heteroduplex analysis, and
chemical mismatch cleavage. The mutated region will then be confirmed by PCR amplification and
sequencing of the corresponding genomic DNA.
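
The marker selection and LOH typing steps in strategy 1 can be illustrated with a minimal Python sketch. It is not the procedure used in this project: the report does not say which heterozygosity estimate is meant, so the sketch computes both the expected value (1 minus the sum of squared allele frequencies) and the observed fraction of heterozygous individuals, and applies the 0.6 cutoff to the expected value as an assumption. The function names, the allele sizes, and the use of a matched normal sample for the LOH call are all hypothetical.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Expected heterozygosity, 1 - sum(p_i^2), from pooled allele frequencies."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def observed_heterozygosity(genotypes):
    """Fraction of typed individuals that carry two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def is_polymorphic(genotypes, threshold=0.6):
    """Apply the 0.6 cutoff (assumed here to refer to expected heterozygosity)."""
    return expected_heterozygosity(genotypes) > threshold


def shows_loh(normal, tumor):
    """LOH call: the normal sample is heterozygous (informative) and the
    tumor sample retains only one of the two normal alleles."""
    normal_alleles, tumor_alleles = set(normal), set(tumor)
    informative = len(normal_alleles) == 2
    return informative and len(tumor_alleles) == 1 and tumor_alleles < normal_alleles


# Hypothetical PCR product sizes (bp) for a (CA)n marker typed on twenty individuals.
genotypes = [(120, 122), (124, 126)] * 10
print(expected_heterozygosity(genotypes), is_polymorphic(genotypes))
print(shows_loh(normal=(120, 124), tumor=(120, 120)))
```

A marker that is homozygous in the normal sample is uninformative, so `shows_loh` returns False for it rather than reporting retention.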

2). We would like to embark on an approach in parallel with the strategy described
above. This involves screening genes for microsatellite (dinucleotide and trinucleotide)
repeat sequences. Over the past six years, eight human diseases have been found to be
associated with trinucleotide repeat expansion (22). The trinucleotide repeat expansion mechanism
has provided an explanation for the non-Mendelian genetic phenomenon clinically termed anticipation.
In heritable diseases, anticipation describes the increasing incidence of a disease in a family from
generation to generation as well as the earlier onset of the disease in successive
generations. Simple trinucleotide repeats are now proven to be the molecular origin of
anticipation in seven of these heritable diseases. These are spinobulbar muscular atrophy
(SBMA), fragile X syndrome (FX), myotonic dystrophy (MD), Huntington's disease
(HD), spinocerebellar ataxia type 1 (SCA1), dentatorubral-pallidoluysian atrophy (DRPLA), and
Machado-Joseph disease (MJD). The exception is the recently identified gene for
Friedreich ataxia, which is autosomal recessive and shows no genetic anticipation. A novel
expansion of a GAA trinucleotide repeat within an intron was discovered to cause Friedreich ataxia.
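
Screening sequenced cDNAs for dinucleotide and trinucleotide repeats, as proposed in strategy 2, could be done computationally along the following lines. This is only a sketch under stated assumptions: the function name `find_repeats`, the minimum of five tandem copies, and the example sequences are illustrative choices, not parameters taken from this project.

```python
import re


def find_repeats(sequence, unit_lengths=(2, 3), min_copies=5):
    """Return sorted (start, unit, copies) tuples for tandem di-/trinucleotide runs."""
    sequence = sequence.upper()
    hits = []
    for k in unit_lengths:
        # One unit of k bases followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            if len(set(unit)) == 1:
                continue  # ignore homopolymer runs such as AAAAAA
            copies = len(match.group(0)) // k
            hits.append((match.start(), unit, copies))
    return sorted(hits)


# Hypothetical sequences: a (CA)6 run and a (CAG)7 run with short flanks.
print(find_repeats("GGCACACACACACATT"))
print(find_repeats("TT" + "CAG" * 7 + "AA"))
```

Start positions are 0-based offsets into the input sequence. Raising `min_copies` makes the scan more selective, at the cost of missing short repeats that may still be polymorphic.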

This recent discovery suggests that other variations of microsatellite sequences may be involved in other
human diseases. The majority of breast cancers are non-Mendelian, i.e. sporadic. A
mechanism based on microsatellite instability is therefore an excellent candidate for explaining sporadic human
diseases, including breast cancer. In fact, colon cancers due to mutations in DNA mismatch repair
genes have a high level of microsatellite instability (23). Our goal is to identify genes with these
microsatellite features and to type them on breast cell line DNA to determine the presence of
aberrant alleles. Aberrant alleles can be distinguished from normal length polymorphisms by
comparing the alleles from the disease state against those from the non-disease state.
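
The comparison of disease-state alleles against non-disease-state alleles can be expressed as a simple set difference on allele lengths. The sketch below is illustrative only: the function name `compare_alleles`, the use of PCR product lengths in base pairs, and the example values are assumptions, and a real analysis would also need to allow for sizing error.

```python
def compare_alleles(disease, reference):
    """Compare allele lengths (bp) from a disease sample with those seen in
    non-disease samples.

    Returns the alleles found only in the disease sample ("novel") and the
    subset of those that are longer than every reference allele ("expanded").
    """
    disease_set, reference_set = set(disease), set(reference)
    novel = sorted(disease_set - reference_set)
    longest_reference = max(reference_set)
    expanded = [allele for allele in novel if allele > longest_reference]
    return {"novel": novel, "expanded": expanded}


# Hypothetical lengths: the 171 bp allele is absent from the non-disease samples.
print(compare_alleles(disease=[150, 171], reference=[150, 153]))
```

An allele reported as novel is only a candidate aberrant allele, since it may also be a rare normal length variant that the reference samples did not happen to include.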

Conclusions 

The proposal goals for the past year focused on the development of resources and the
application of this technology so that genes specific to chromosome 17 could be identified rapidly.
To date we have successfully generated the resources proposed for this project and have used
them to isolate 105 unique cDNAs. In addition to the proposed libraries we have
added a human heart cDNA library to this study. We have also kept up with the 5' and 3'
sequencing of the cDNAs isolated. The information obtained from the 5' and 3' ends of these
cDNAs has allowed us to search the Genome database for sequence homology. Of the
characterized cDNAs, 43 are either novel or have homology to ESTs in the
database; 16 have previously been described and mapped to chromosome 17; the remaining 46
are homologous to previously described but unmapped genes. We mapped 8 cDNAs to the
region 17p13, 44 cDNAs to the region 17q12-22, and 20 cDNAs to the region 17q24-25,
three regions frequently observed to show LOH in breast cancer tissue. We also propose to isolate
the full-length cDNAs for these genes and to generate genomic DNA from disease
material for LOH studies.

The present study demonstrates that this novel strategy for isolating chromosome-specific
genes is efficient. The reagents generated by the reciprocal probing strategy, including cDNAs, mapped
cosmids, and sequence-tagged site (STS) primers, can provide a high level of transcript map
characterization. The isolation and mapping of chromosome 17 cDNAs has provided candidates
for the proposed sporadic breast cancer genes and for other human diseases mapped to this
chromosome.